Automated Presentation of directory src/exampleCode/audio/sonic/





README file from "sonic" directory

             ~4Dgifts/toolbox/src/exampleCode/audio/sonic README

     This README contains an article that originally appeared in the 
     March/April 1993 issue of the Silicon Graphics support magazine 
     "Pipeline", Volume 4, Number 2.  It is the companion to the sonic.c 
     program example.  



              ADDING AUDIO TO AN EXISTING GRAPHICS APPLICATION.

   
   This article will discuss some of the common issues and techniques
   programmers encounter when adding audio to an existing graphics 
   application.  Here, we'll look at an example program, sonic.c,
   that illustrates some of the techniques you will want to use -
   spawning a separate audio process, initializing the audio port,
   reading audio files, playing one-shot vs. continuous audio, 
   mixing sounds, using stereo audio effectively in 3D programs, and
   dealing with CPU demands of an audio process that cause undesirable
   clicking.
   
   
   HARDWARE AND SOFTWARE REQUIREMENTS
   
   Currently, under version 4.0.5F of IRIX, audio development can be done 
   on any Iris Indigo platform - R3000 or R4000, from Entry to XS to
   Elan graphics.  This class of machine has standard digital audio 
   hardware, and is able to run the Digital Media Development Option, 
   for which this example program was written.
   
   The Digital Media Development Option is a relatively new
   software package which consists of Application Programmers'
   Interfaces (API's) for digital audio (libaudio), the starter 
   video (libsvideo), MIDI (libmidi), the AIFF-C audio file 
   (libaudiofile), CD and DAT drive audio access (libcdaudio and 
   libdataudio).  This package comes with the Digital Media 
   Programmers Guide to aid in learning about the new libraries. 
   The Digital Media Development Option requires version 4.0.5F 
   of the operating system and will run on any Iris Indigo system.
   
   Note: Some 4D35's have the ability to produce the same quality
   sound as that of the Indigo. Programming audio on the 4D35's
   at version 4.0.5A or earlier of the operating system uses
   the same Audio Library that the Indigo uses.  The Digital
   Media Development Option, unfortunately, will currently not run 
   on a 4D35 system, as IRIX version 4.0.5F is not qualified for 
   installation on 4D35's.
   
   For this programming example, you'll need audiodev.sw from the 
   Digital Media Development Option, which contains the libraries 
   for the Audio Library (AL) and the Audio File Library (AF).  
   These subsystems comprise the Audio Development Environment, 
   version 2.0 and are provided with the Digital Media Development 
   Option.  (These are the successors to the Audio Development 
   Environment originally provided with the Iris Development Option, 
   in version 4.0.1 of the operating system.)  
   
   This example also makes use of some of the Prosonus sound files
   found in the /usr/lib/sounds/prosonus directory.  These audio
   files originate from the audio.data.sounds subsystem from the
   4.0.5F Update to the operating system.  Check the output of the 
   IRIX command versions to make sure these subsystems are installed 
   on your machine.
   
   
   ABOUT THE EXISTING GRAPHICS PROGRAM
   
   The structure of the graphics program which sonic is based on is
   typical of Graphics Library (GL) programs and lends itself easily
   to conversion to mixed-model programming.  Windowing and input
   routines are handled in the main body of the program. GL
   routines that need to be called once are handled in the sceneinit() 
   function, rendering is handled by the drawscene() function, and 
   animation is handled by the movesphere() function. 
   
   The program itself models a stationary observer inside a colored 
   cube with a sphere that flies around the inside of the cube, 
   bouncing off walls.  The observer can rotate (but not translate) 
   the camera's viewpoint by moving the mouse to see different views from 
   inside the cube.  The left mouse button starts and stops the motion 
   of the sphere.
   
   We'll be adding two different sounds to create a stereo audio
   environment for the application.
   
   The first sound will be an example of continuous audio - a sound the 
   sphere continuously makes as it moves.  This is analogous to the 
   constant engine noise of a plane.  As the sphere moves, the sound's 
   intensity increases or decreases with its distance from the observer.  
   As the observer's viewpoint rotates, the 
   audio "location" (as perceived through headphones) will also change; 
   that is, as the sphere passes the observer from right to left, so 
   will the sound of the sphere.  
   
   The second sound will be a one-shot sound - a sound the sphere
   makes when its direction changes, either from bouncing off a wall
   or from the user toggling the sphere's motion with the left mouse 
   button.  This is analogous to the sound of a missile hitting a 
   target.  This sound is also affected by the orientation of the 
   observer and the distance from the sphere's event.
   
   
   AUDIO LIBRARY BASICS
   
   The Audio Library (AL) itself divides the task of sending audio
   samples to the hardware into two main areas of control - 
   devices and ports. 
   
   The audio device controls the input/output volume, input source, 
   input/output sampling rate, etc.  AL functions exist to control
   the default audio device; however, it is considered "polite" audio
   etiquette to let the user control these via apanel.  Apanel itself
   is just an AL program that controls the default audio device with 
   a graphical user interface.  It is possible for a program that
   asserts its own audio device parameters to modify another audio 
   program's device settings.  The Indigo's default audio device supplies
   up to four audio ports, either stereo or mono, for AL applications 
   to use.
   
   An audio port is the entity through which an AL program reads samples 
   from, or writes samples to, the audio device.  A port can be thought of as a queue of 
   sound samples, where the AL programmer has control over only one end 
   of the queue.  Thus, a program that received audio input would read 
   samples in from one end of this queue, with the audio device supplying 
   the samples from an input source such as a microphone.  Conversely, 
   a program that generated audio output would supply data for one 
   end of the queue, and the audio device would send the queued
   samples to the audio outputs, such as the Indigo's speaker, or a set 
   of headphones. An audio program that did both audio input and audio 
   output would require the use of two audio ports and associated queues.
   We'll discuss the size of this queue a little later.
   
   A port can be configured to utilize an audio sample datatype that 
   best suits the application with the ALsetsampfmt() function. 
   Sample data can be represented by a 2's complement integer or single 
   precision floating point.  Integer data can be 8, 16, or 24 bits wide.  
   Sample width is controlled by the ALsetwidth() command.  Sample width 
   does not represent the maximum amplitude of the input or output of an 
   audio signal coming into or out of the audio jacks.  If this were true, 
   one could incorrectly imply that an audio port with a sample width of 
   24 could have a louder dynamic range than an audio port of width 8.  
   Instead, sample width represents the degree of precision to which the 
   full scale range of an input or output signal will be sampled.  That 
   is, if the maximum value for an 8-bit sample is 127, or 2^7 - 1, the 
   signal level represented by this sample could also be represented by a 
   16-bit sample whose value is 32767, or 2^15 - 1, or a 24 bit sample 
   whose value is 2^23 -1.  For floating point data, sample width is 
   always the same, but having MAXFLOAT as the maximum amplitude is often 
   impractical.  The AL function ALsetfloatmax() allows the programmer 
   to specify an appropriate maximum value for their own data when the 
   sample format is designated to be floats.  Dynamic range of the data is 
   required to be symmetrical and centered around the value 0, so the 
   absolute value of the minimum amplitude value is always equal to the 
   maximum amplitude.
   
   Ports can be configured to accept either stereo or monaural sample 
   streams with the ALsetchannels() call.  Stereo sample streams are 
   implemented as interleaved left-right pairs, where even numbered 
   samples represent the left channel and odd numbered samples represent 
   the right channel.  As one might expect, a stereo sample buffer will 
   be twice as big as a mono sample buffer.
   
   
             Array index    0   1   2   3   4   5
                          -------------------------
   Audio Sample Array     | L | R | L | R | L | R | ... 
                          -------------------------
                            \  /
                             \/
                             A stereo sample pair to be input 
                             or output simultaneously
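
   As a concrete illustration, the calls below (a sketch only; sonic's own
   configuration may differ in detail) would set up an ALconfig for stereo 
   floating point samples with a maximum amplitude of 1.0, or alternatively 
   for 16-bit two's complement integers:
   
    #include <audio.h>
    
    ALconfig audioconfig;
    
    audioconfig = ALnewconfig();
    ALsetsampfmt(audioconfig, AL_SAMPFMT_FLOAT);  /* float samples...            */
    ALsetfloatmax(audioconfig, 1.0);              /* ...with amplitudes in +-1.0 */
    ALsetchannels(audioconfig, AL_STEREO);        /* interleaved left-right pairs */
    
    /* or, for 16-bit integer data:
     *     ALsetsampfmt(audioconfig, AL_SAMPFMT_TWOSCOMP);
     *     ALsetwidth(audioconfig, AL_SAMPLE_16);
     */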
   
   
   CREATING A SEPARATE PROCESS FOR AUDIO
   
   It's nice to be able to keep audio and graphics separate. 
   For this example program, the audio we're producing is in reaction
   to events being handled by the section of the program responsible
   for graphics.  The graphics process controls the motion and position 
   of the sphere, as well as the orientation of the observer.  These are
   all aspects we'd like our audio display to reflect, but not control.  
   Creating a completely separate process to handle the audio has one 
   main benefit - it provides enough independence of the audio aspects 
   of the application so that audio performance is not degraded when the
   controlling process contends for graphics resources.  This independence 
   can be achieved with the sproc() function and can be enhanced by 
   raising the priority of the audio process through IRIX scheduling 
   control which will be discussed later.
   
   
   The sproc() function is the mechanism for spawning a separate audio 
   child process from our parent process which handles the graphics.  
   Sproc() is nice and easy for our purposes.  It says, "Create a separate 
   process and have it start out by executing the function I tell you."   
   The original process will continue merrily on its way.
   Besides a starting-point function, sproc() takes another argument
   which tells how the parent and child processes should share data.
   The sonic program starts the newly created child process at
   the audioloop() function.  The PR_SALL argument that sonic uses
   tells the parent and child to share nearly everything.  We're
   mostly interested that the parent and child processes share virtual 
   address space and that the data in this address space is consistent 
   between them.  This means that the audio process will get to look at 
   how the graphics process changes the values of the "interesting" variables. 
   This also means that if either the graphics process or the audio 
   process changes the value of a variable, the other will know about it 
   immediately.  Having the variables shared becomes the mechanism of 
   communication between the two processes.  See the man page for sproc() 
   for intimate details.
   
   In general, it is not recommended that two separate processes both 
   call GL functions pertaining to the same connection to the graphics 
   pipe.  To avoid encouraging graphics calls within the audio process,
   and to establish a logical separation of the graphics and audio 
   processes, the sproc() is called before winopen().
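
   A minimal sketch of this arrangement (the window name and error handling
   here are illustrative, not necessarily sonic's own):
   
    #include <sys/types.h>
    #include <sys/prctl.h>
    #include <stdio.h>
    #include <gl/gl.h>
    
    void audioloop(void *arg);            /* audio child starts here */
    
    ...
    
    /* spawn the audio process, sharing (nearly) everything with the parent */
    if (sproc(audioloop, PR_SALL) == -1)
        perror("sproc");
    
    /* ...then open the window; graphics stays in the parent process */
    winopen("sonic");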
   
   
   INITIALIZATION
   
   The main task of any AL program is to read or write audio data to or 
   from an audio port fast enough so that the user perceives the desired 
   audio effect without interruption, i.e. the sample queue is never
   completely empty.  Any AL program that performs audio processing for 
   output will have a code structure that looks something like the pseudo-
   code below.  The elements of the pseudo-code can be seen in sonic.c. 
   
   
   #include <audio.h>
   
   ALport audioport;
   ALconfig audioconfig;
   
    /* Audio initialization */
    audioconfig = ALnewconfig();    /* New config structure */
    
    ...    /* Set up configuration */
           /* of audio port.       */
   
   ALsetsampfmt( audioconfig, AL_SAMPFMT_FLOAT);
   
   ...
   
   audioport = ALopenport("port name","w",audioconfig);   
       /* Open audio port */
   
   /* Audio main loop */
   while( ! done ) {
       process_audio();            /* Compute samples */
       ALwritesamps(audioport, samplebuffer, bufferlength);
               /* Output samples to port */
   }
   
   /* Audio shut down */
   ALfreeconfig(audioconfig);
   ALcloseport(audioport);             /* Close audio port */
   
   
   Notice that port configuration information is put into the Audio 
   Library structure ALconfig which is then passed as a parameter to the 
   ALopenport() function.  This implies that if we wish to change to 
   or mix different sample formats of our data, or any other aspect
   of the audio port's configuration, we will either need to open 
   another audio port or convert all sample data to one common format.
   
   Choosing the size of the sample queue for the configuration of
   the audio port is very important in applications such as this
   where audio dynamics are constantly changing.  The AL function
   ALsetqueuesize() provides the means of control here.  The Audio
   Library currently allows the minimum queue size to be 1024 samples
   (or 512 stereo sample pairs) for the floating point audio port we're 
   using in sonic.  It is not a bad idea to set the size of your sample 
   queue to be about twice that of the number of samples you are 
   processing.  This gives some leeway for audio processing time to 
   take a little longer than expected if the audio device occasionally 
   drains the audio sample queue too fast, but also provides enough room 
   to send a fresh batch of samples if the queue is draining too slowly.
   However, it is possible for audio latency to increase with a queue 
   larger than needed.  Stereo sample queues need to be kept at even 
   lengths so that the left-right pairs stay aligned; otherwise the sense 
   of stereo separation would be swapped on every second call to 
   ALwritesamps().
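
   For example (a sketch, assuming BUFFERSIZE - sonic's #define, discussed
   below - counts the samples processed per iteration):
   
    ALsetqueuesize(audioconfig, 2 * BUFFERSIZE);   /* roughly twice the */
                                                   /* processing buffer */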
   
   Try changing the #define BUFFERSIZE to 512.  Recompile and run the
   sonic program.  You might notice that, in general, the audio sounds
   scratchier.  This can be because the actual processing of the
   audio is taking longer than the hardware's playing of the audio.
   In other words, it's possible for the sample queue to be
   emptied faster than it's filled.  A small sample buffer may
   provide lower latency between updates to the output sample stream,
   but you need to keep the output continuous to keep it from producing
   undesirable clicks.  On the other hand, a larger sample buffer
   will increase latency and is apt to detract from the overall
   audio experience.  Change the BUFFERSIZE to 44100 to hear the
   effects of high latency.  Notice the stereo placement of the
   sphere seems choppier.  A BUFFERSIZE of 4000 seems to do a good
   job for sonic - enough to keep the audio process busy, without
   detracting from the user's interaction with the application. 
   You'll have to find your own happy medium for your own application.  
   (Don't forget to change BUFFERSIZE back!)
   
   Thirty-two bit floats are used in this example as the base sample 
   format to which all sample data will be converted.  They provide a 
   convenient means for multiplication without type casting in the time 
   critical sections of the code that do the actual sound processing.  
   Also, specifying a maximum amplitude value of 1.0 can provide for a 
   handy normalization of all sound data, especially if some waveforms 
   are to affect the behavior of other waveforms (like an envelope 
   function).
   
   
   READING AUDIO FILES
   
   The sounds used in sonic are read in from the routine init_sound().
   It uses the Audio File Library (AF) to read AIFF (.aiff suffix)
   and AIFF-C (.aifc suffix) audio files.  To provide a common
   ground for some nice sounds, sonic uses sounds from the 
   /usr/lib/sounds/prosonus directory.  You should be able
   to change which sounds are used by editing the BALLFILENAME
   and WALLFILENAME #defines.  
   
   The AF has two main structures to deal with: an audio file setup 
   and an audio file handle.  The setup, which is used mainly for 
   writing audio files, is passed to the AFopenfile() routine to 
   obtain a file handle.
   For more information on audio file setups, see the Digital
   Audio and MIDI Programming Guide that comes with the Digital
   Media Development Option.  
   
   An audio file contains a lot of information about the data inside.
   Sample format, sample width, number of channels, number of sample 
   frames, as well as the sound data itself, are about all this example 
   needs.  It is possible to get information from an AIFF-C file that 
   describes looping points, pitch and suggested playback rates (if they 
   are provided in the file).  See the Audio Interchange File Format
   AIFF-C specifications and the Digital Audio and MIDI Programming 
   Guide for more details on what can be stored in an AIFF-C file.
   
   When reading audio files into your program, you may find it
   necessary to convert the sample data into a format that's
   better suited to your application.  Most of the prosonus
   sounds are in 16-bit 2's complement format.  Any file that is 
   not in this format produces an error message, as an appropriate 
   conversion to floating point for other formats was not implemented
   for the sake of simplicity.  Since the program is dealing with 
   point sources of audio, a stereo sound source is inappropriate.  
   Thus, the conversion to floating point also includes a conversion 
   from stereo to mono.  In this conversion, only the left channel 
   is used.  A summation or average of the left and right channels 
   could have been just as easy to implement as our conversion from 
   stereo to mono.
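
   The sketch below illustrates the kind of work init_sound() performs for
   each file.  The function name and details are illustrative only, error
   handling is omitted, and the 16-bit format check sonic performs is left
   out for brevity:
   
    #include <stdlib.h>
    #include <audiofile.h>
    
    /* read an AIFF/AIFF-C file, keep the left channel only, and convert
     * the 16-bit samples to floats in the range -1.0 .. 1.0 */
    float *load_mono_float(char *filename, long *nframes)
    {
        AFfilehandle af;
        long frames, channels, i;
        short *raw;
        float *samps;
    
        af = AFopenfile(filename, "r", AF_NULL_FILESETUP);
        frames   = AFgetframecnt(af, AF_DEFAULT_TRACK);
        channels = AFgetchannels(af, AF_DEFAULT_TRACK);
    
        raw   = (short *) malloc(frames * channels * sizeof(short));
        samps = (float *) malloc(frames * sizeof(float));
        AFreadframes(af, AF_DEFAULT_TRACK, raw, frames);
        AFclosefile(af);
    
        for (i = 0; i < frames; i++)             /* left channel only */
            samps[i] = (float) raw[i * channels] / 32768.0;
    
        free(raw);
        *nframes = frames;
        return samps;
    }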
   
   
   ONE-SHOT & CONTINUOUS AUDIO
   
   Many applications add audio simply by making a system() call
   to the IRIX playaiff or playaifc utilities.  For some 
   applications this is enough to add a little bit of audio, but this 
   approach can be limiting in that your audio is only as effective as 
   the sound file you play.  This solution can be a quick and dirty way 
   to do one-shot audio - audio that can be triggered by a single event, 
   like a car crashing into a tree, or the sound of a ball hitting a 
   tennis racket  - but it comes with the penalty of losing interaction 
   with the sound.  Sometimes interaction is not a concern for these 
   types of sounds.
   
   Continuous audio is different from one-shot audio in that it describes
   a sound that's always present to a certain degree, like the sound
   of a jet engine for a flight simulator, or the sound of crickets for
   ambience.
   
   In an application where the audio output changes continually with 
   the user's input, it can be convenient to approach preparing the
   samples in chunks of equal amounts of time.  Changes in audio
   dynamics will happen on sound buffer boundaries and multiple sounds
   will need to be mixed together to form one sample stream.
   
   Processing continuous audio is fairly straightforward.  Sounds can
   be longer than the buffer that is being used for output; therefore,
   an index needs to be kept of where the continuous sound left off
   for the next time around.  Looping can be achieved by a test to
   see if the index goes out of bounds or (even better) by a modulo
   function.  
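
   A sketch of that idea follows.  All of the names here (BUFFERFRAMES,
   ballsamps, leftgain and so on) are illustrative rather than sonic's own,
   and the gains come from the spatialization calculations described later:
   
    #define BUFFERFRAMES 2000     /* stereo frames processed per pass */
    
    float *ballsamps;             /* the continuous sound, loaded at startup */
    long   ballframes;            /* its length in frames                    */
    long   ballindex = 0;         /* where the sound left off last time      */
    
    void process_continuous(float *buf, float leftgain, float rightgain)
    {
        long i;
    
        for (i = 0; i < BUFFERFRAMES; i++) {
            float s = ballsamps[(ballindex + i) % ballframes];
            buf[2*i]     = leftgain  * s;        /* interleaved stereo */
            buf[2*i + 1] = rightgain * s;
        }
        ballindex = (ballindex + BUFFERFRAMES) % ballframes;
    }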
   
   Processing one-shot audio is similar to continuous audio, with the
   additional criterion that the program needs to keep track of
   information such as when to start the sound, when to continue 
   processing the sound (if the one-shot sound is longer than the 
   audio buffer), and when to stop processing the sound.  Sonic defines 
   the variable "hit" to describe if there is NO_HIT on the wall (no 
   processing needed), if the sphere JUST_HIT the wall (start processing),
   or if the wall has BEEN_HIT by the sphere (continue processing).
   While it is the graphics process that initiates the one-shot
   sound by changing the state of the "hit" variable, it is the 
   audio process that acknowledges the completion of the one-shot 
   sound by changing the state of the variable to indicate completion.
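
   In outline (NO_HIT, JUST_HIT and BEEN_HIT are the states named above;
   the values and the other names in this sketch are illustrative):
   
    #define NO_HIT   0
    #define JUST_HIT 1
    #define BEEN_HIT 2
    
    int  hit = NO_HIT;       /* shared with the graphics process */
    long wallindex;          /* progress through the wall sound  */
    long wallframes;         /* length of the wall sound         */
    
    void process_oneshot(float *buf)
    {
        if (hit == JUST_HIT) {              /* graphics process set this */
            wallindex = 0;                  /* start the one-shot sound  */
            hit = BEEN_HIT;
        }
        if (hit == BEEN_HIT) {
            /* ... mix the next BUFFERFRAMES frames of the wall sound
             *     into buf, starting at wallindex ... */
            wallindex += BUFFERFRAMES;
            if (wallindex >= wallframes)
                hit = NO_HIT;               /* audio process marks completion */
        }
    }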
   
   
   SIMPLE CALCULATIONS FOR SPATIALIZATION OF AUDIO
   
   Sonic uses some _very_ basic calculations that attempt a 3-D audio 
   simulation of the sphere/room environment.  Amplitude and left-right
   balance relative to the observer are the only audio cues that are 
   calculated in this example.  You will notice that if the sphere is in 
   close proximity to the observer, the amplitude of the sounds emanating 
   from the sphere is louder than it would be if the sphere were in the 
   distance.  You will also notice that as the orientation of the 
   observation point is changed, the left-right location of the sound 
   changes accordingly.  After a bit of playing with sonic, you may notice 
   that your sense of whether the sound is coming from in front of you or 
   in back of you depends on the visual cue of the sphere being within the 
   field of view of the graphics window.  It is audibly obvious that sonic
   does not attempt any calculations for top-bottom or front-back 
   spatialization.  With a two channel (left-right) audio display system,
   such as a pair of headphones,  anything other than a sense of left-right
   balance is computationally difficult to simulate, so we'll ignore it here.
   
   The first thing we will need to calculate is the position of 
   the sphere, described by the coordinates <sphx,sphy,sphz>, relative 
   to the orientation of the observer, described by the angles rx and ry.  
   In order to do this correctly, we'll need to borrow from the computer 
   graphics concept of viewing transformations to compute which direction 
   the observer should perceive the sound to be coming from.  Using these 
   relative coordinates we can first compute the overall amplitude of the 
   sound to reflect the distance of the sound, and then compute the 
   amplitude of the sound for each speaker to reflect the location of the 
   sound in the audio field.
   
   It is the responsibility of the graphics process to update the 
   coordinates of the moving sphere, <sphx,sphy,sphz>, and the angles 
   describing the orientation of the observer, rx and ry.  Since location 
   in the audio display needs to correspond to location in the graphics 
   display, we need to follow the order that modelling and viewing 
   transformations are performed in the graphics process.  In the graphics
   process, the GL commands
   
       rot(ry, 'y');
       rot(rx, 'x');
   
   correspond to the following matrix equation for each point that is passed
   through the transformation matrix.  (Remember that GL premultiplies its
   matrices!)
   

  |      |
  | relx |
  |      |
  | rely |
  |      |  =   <sphx,sphy,sphz> * Rot_x(radx) * Rot_y(rady)   = 
  | relz |
  |      |
  |  1   |
  |      |


                   |                          |   |                          |
<sphx,sphy,sphz> * | 1      0         0     0 |   | cos(rady) 0 -sin(rady) 0 |
                   |                          |   |                          |
                   | 0  cos(radx) sin(radx) 0 |   |     0     1      0     0 |
                   |                          | * |                          |
                   | 0 -sin(radx) cos(radx) 0 |   | sin(rady) 0  cos(rady) 0 |
                   |                          |   |                          |
                   | 0      0         0     1 |   |     0     0      0     1 |
                   |                          |   |                          |


                   |                                                       |
<sphx,sphy,sphz> * |     cos(rady)           0          -sin(rady)       0 |
                   |                                                       |
                   | sin(radx)*sin(rady)  cos(radx) sin(radx)*cos(rady)  0 |
                   |                                                       |
                   | cos(radx)*sin(rady) -sin(radx) cos(radx)*cos(rady)  0 |
                   |                                                       |
                   |          0              0              0            1 |
                   |                                                       |

or

|      |   |                                                                   |
| relx |   | sphx*cos(rady)+sphy*sin(radx)*sin(rady)+sphz*cos(radx)*sin(rady)  |
|      |   |                                                                   |
| rely |   | sphy*cos(radx) - sphz*sin(radx)                                   |
|      | = |                                                                   |
| relz |   | -sphx*sin(rady)+sphy*sin(radx)*cos(rady)+sphz*cos(radx)*cos(rady) |
|      |   |                                                                   |
|  1   |   |                                1                                  |
|      |   |                                                                   |


   Where sphx, sphy and sphz are the world coordinates of the sphere,
   relx, rely and relz are the coordinates of the sphere relative to the 
   observer, and radx and rady describe the rotations (in radians) about 
   the x and y axes, respectively.
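
   In C, the audio process can compute these relative coordinates directly
   from the equations above (a sketch; sphx, sphy, sphz, rx and ry are the
   shared variables described earlier, and rx and ry are assumed to be in
   degrees, as GL's rot() expects):
   
    #include <math.h>
    
    double radx = rx * M_PI / 180.0;
    double rady = ry * M_PI / 180.0;
    
    relx =  sphx*cos(rady) + sphy*sin(radx)*sin(rady) + sphz*cos(radx)*sin(rady);
    rely =  sphy*cos(radx) - sphz*sin(radx);
    relz = -sphx*sin(rady) + sphy*sin(radx)*cos(rady) + sphz*cos(radx)*cos(rady);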
   
   The overall amplitude of a sound can give some impression of a sense of 
   distance.  Each buffer of sound to be processed at any given slice of 
   time is multiplied by an amplitude scaling value that is based on the 
   3-D distance of the sphere relative to the observer.  That amplitude 
   scaling value is approximated with an inverse-square of the distance 
   from the center of the observer's head to the center of the sphere by 
   the equation
   
   amplitude = 1.0 / (distance*distance + 1.0)
   
   The 1.0 added to the square of the distance in the denominator is to 
   ensure we get a valid scaling value between 0.0 and 1.0, even when the 
   sphere is right on top of the observer at a distance of 0.0.
   
   Since the most common method of audio display is either a set of 
   headphones or a pair of speakers, sonic only attempts to simulate 
   a sense of left and right.  It may be possible to simulate a sense 
   of top and bottom, as well as a sense of front and back, perhaps 
   with a combination of filters and echoes, however this can be 
   computationally expensive and quite complex. Thus, for the sake
   of this simple example, sonic ignores these techniques for aurally 
   simulating orientation.
   
   One way of considering how left-right orientation is perceived 
   is to think of a listener interpreting a given stereo sound 
   as having some angle from what the listener considers to be directly 
   in front of them.  The balance knob on an everyday stereo controls this 
   sense of left-right placement.  We'll use the term "balance" to describe 
   the listener's perceived sense of left and right.  
   
   We can think of balance being on a scale from 0.0 to 1.0, where
   0.0 is full volume right ear only, 1.0 is full volume left ear only, 
   and 0.5 is the middle point with both ears at half volume.
   Now we need some way of relating this 0.0 - 1.0 scale for balance to 
   the general orientation of the sonic sphere with respect to the observer.  
   
   For convenience, we can think of our sound space for computing balance
   to be 2 dimensional.  Since we're not worrying about aurally simulating 
   top-bottom orientation, our 3-D space for graphics can be projected into 
   the 2-D plane for our audio, the x-z plane, where the listener's perception 
   of straight ahead is in the positive z direction (0.5 on our balance scale), 
   full right extends in the positive x direction (balance of 0.0), full left 
   extends in the negative x direction (a balance of 1.0).
   
                        half left/half right
                          balance = 0.5
                            angle = PI/2
   
                                 +z
                                        O sphere
                                 ^     /   
                                 |    /
                             _/-----\_     perceived angle
                            /    |  / \
                           /     | /   \
       full left           |     |/)   |             full right
    balance = 1.0   -x  ---|-----+-----o---->  +x   balance = 0.0
      angle = PI           |     |     |(1.0,0.0)     angle = 0 
                           \     |     /
                            \_   |   _/
                              \-----/
                                 |
                                 |
   
                      observer at origin, (0.0,0.0)
   
    
   The angle that is to be interpreted by the listener and mapped to our 
   scale for balance is the angle between the vector extending from the 
   center of the observer to the center of the sphere and the line that 
   goes through both ears of the observer (the line z=0).  A simple way 
   of mapping this angle to our 0.0 to 1.0 scale for balance would be the
   arccosine function.  An angle of PI radians could map to our scale for 
   balance at 1.0 - all the way left; an angle of 0 radians could map to 
   our scale for balance at 0.0 - all the way right.  
   
   To map our vector from observer to sphere onto the unit circle required
   for the arccosine function, we need to normalize the vector; the 
   argument to the arccosine function is then the distance the normalized 
   vector from the observer to the center of the sphere extends in the x 
   direction.  So the equation sonic uses to compute left-right balance is 
   
    	balance = acos(relx/distance) / PI
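
   Putting the distance, amplitude and balance equations together gives
   something like the following (a sketch; the guard against a zero
   distance and the per-channel gain names are our own additions):
   
    #include <math.h>
    
    distance  = sqrt(relx*relx + rely*rely + relz*relz);
    amplitude = 1.0 / (distance*distance + 1.0);
    
    if (distance > 0.0)
        balance = acos(relx / distance) / M_PI;  /* 1.0 = full left, 0.0 = full right */
    else
        balance = 0.5;                           /* sphere at the observer's head */
    
    /* per-channel gains for the interleaved stereo buffer */
    leftgain  = amplitude * balance;
    rightgain = amplitude * (1.0 - balance);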
   
   
   Other spatialization techniques you may wish to experiment with 
   include some sort of phase calculation between the two ears.  Adding 
   reverb or delay that interacts with the sounds can add a sense of 
   depth to the room.  Filters for top-bottom or front-back transfer 
   functions could also be implemented.  However, none of these would 
   come without adding computational complexity and an extra tax on the CPU.
   
   
   SENDING MORE THAN ONE SOUND TO A SINGLE AUDIO PORT
   
   Since the continuous audio is always the first sound to be 
   processed in sonic, no mixing is needed - samples are just copied 
   into the buffer - but the one-shot samples need to be mixed with 
   those that are already there.  Processing the one-shot sound always 
   comes after the processing of the continuous sound in this application. 
   A simple mixing operation can be written by just summing the samples
   that are currently in the sample buffer to those that are about
   to be mixed in.  
   
   Beware!  
   
   Clipping, perceived as sharp clicks, can occur if the sum of the two 
   samples exceeds the value for maximum or minimum amplitude.  To 
   prevent this undesirable effect, weighted averages for the samples to 
   be summed can be used.  If a maximum of two sounds will be mixed, 
   weighting each sound by 1/2 before summation will guarantee no 
   clipping.  For 3 sounds use 1/3, etc.  This guarantee does not 
   come without a trade-off, though.  You'll have to decide on yet 
   another happy medium, this time between the clipping that a straight 
   summation can produce and the general decrease in overall volume 
   that weighted averages can produce. 
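
   A sketch of the weighted version for sonic's two sounds (buf already
   holds the continuous sound; wallbuf is an illustrative buffer holding
   this pass's one-shot samples):
   
    long i;
    
    for (i = 0; i < 2 * BUFFERFRAMES; i++)   /* 2 samples per stereo frame */
        buf[i] = 0.5 * buf[i] + 0.5 * wallbuf[i];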
   
   Now that we've done all our processing, it's time to send the 
   sample buffer to the audio port for output using the ALwritesamps() 
   call.  If there is not enough room in the sample queue for the 
   entire sample buffer to fit, ALwritesamps() will block until there 
   is enough space.  It is possible to get feedback on the progress of 
   the draining (or filling if the audio port is configured for input
   rather than output) of the queue by the audio device.  The 
   ALgetfillable() and ALgetfilled() functions can be used to give an
   idea of how many samples are left to go before sufficient space is 
   available in the queue.  The sonic audio process calls sginap()
   to give up the CPU if it needs to wait for room in the sample
   queue.
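
   In outline (a sketch of the pattern, not sonic's exact code):
   
    #include <unistd.h>                          /* sginap() */
    
    /* inside the audio main loop, after the buffer has been filled: */
    while (ALgetfillable(audioport) < 2 * BUFFERFRAMES)
        sginap(1);                               /* yield the CPU for a tick */
    ALwritesamps(audioport, samplebuffer, 2 * BUFFERFRAMES);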
   
   
   COMMON SOURCES OF AUDIO CLICKS AND DISCONTINUITY
   
   Discontinuities in audio can arise as sharp clicks or complete
   dropouts of sound.  In general, pay greater attention to smooth audio 
   performance than to graphics performance and frame rates.  A sudden, 
   unexpected loud click is much more irritating to an end-user than 
   graphics that aren't going as fast as they could.  Here are some 
   common causes and suggested workarounds for discontinuities in audio:
   
   1) Audio processing is feeding the output sample buffer slower than
   the audio device is draining the samples.  As discussed earlier,
   this usually happens with small sample queue sizes.  Increasing
   the queue size for your audio port _can_ help here.  Keep in
   mind that extensive audio processing may bog down the CPU,
   in which case, your audio process may never be able to keep
   the sample queue filled adequately.
   
   2) Clipping from mixing sounds.  The "Beware!" from the text above.
   See the section "Sending more than one sound to a single audio port".
   
   3) Buffer boundaries in interactive audio.  In the graphics process,
   motion is only as continuous as the frame rate will dictate.
   If your audio process is like sonic, audio dynamics can change 
   from iteration to iteration of the sound processing.  Like
   the frame rate in graphics, the continuity of the audio
   is only as smooth as the continuity of the data that changes
   the dynamics of the audio.  This source of discontinuity
   tends to be more subtle.  Perception of this type of discontinuity 
   can be reduced by decreasing the size of the audio buffer.
   
   4) Other processes are contending for the same CPU.  Indigos are single
   processor machines and all other processes need to use the same
   CPU as your audio process.  The audio process can lose CPU time due 
   to IRIX scheduling of other processes (including the graphics 
   process).  One solution is to use the schedctl() function.  Upon 
   entering the audioloop() function, sonic's audio process tries to 
   schedule itself at a high, non-degrading priority to put its priority
   above other user processes contending for the CPU.  To gain the high,
   non-degrading priority the process must be run as the super-user.
   Sonic continues on if the use of schedctl() fails.  See the man 
   page for schedctl() for the gory details.
   
   5) The audio process is swapping out of virtual memory.  Just because
   a process has a high priority doesn't mean parts of virtual
   memory will not be swapped out to disk.  You can use the
   plock() command to combat this.  Sonic attempts to lock
   the process into memory just after attempting the schedctl()
   command.  The argument of PROCLOCK indicates that all aspects
   of the program are to be locked into memory if possible. 
   Like schedctl(), plock() will only be successful if the 
   effective user id is that of the super-user.  Sonic continues on 
   if the use of plock() fails.  See the man page for plock() for 
   the details.  A sketch combining this call with the schedctl() 
   call from item 4) appears after this list.
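
   A sketch combining items 4) and 5) (the exact priority value chosen
   here is illustrative, not necessarily the one sonic uses):
   
    #include <sys/types.h>
    #include <sys/schedctl.h>                /* schedctl(), NDPRI, NDPHIMAX */
    #include <sys/lock.h>                    /* plock(), PROCLOCK           */
    #include <stdio.h>
    
    /* both calls require super-user privilege; carry on if they fail */
    if (schedctl(NDPRI, 0, NDPHIMAX) == -1)  /* high, non-degrading priority */
        fprintf(stderr, "schedctl failed - continuing without it\n");
    
    if (plock(PROCLOCK) == -1)               /* lock text and data in memory */
        fprintf(stderr, "plock failed - continuing without it\n");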
   
   Not everyone who runs the program will have access to the
   super-user account.  You can ensure that users execute
   the program with super-user privileges, taking advantage of a high,
   non-degrading priority and of locking the process into memory,
   by changing the ownership of the executable to root and by 
   changing the permissions to set-user-id on execution.  See the 
   man page for chmod for more details.


Copyright © 1995, Silicon Graphics, Inc.